Super-Sampling with a Reservoir

Authors

  • Brooks Paige
  • Dino Sejdinovic
  • Frank D. Wood
Abstract

We introduce an alternative to reservoir sampling, a classic and popular algorithm for drawing a fixed-size subsample from streaming data in a single pass. Rather than draw a random sample, our approach performs an online optimization which aims to select the subset that provides the best overall approximation to the full data set, as judged using a kernel two-sample test. This produces subsets which minimize the worst-case relative error when computing expectations of functions in a specified function class, using just the samples from the subset. Kernel functions are approximated using random Fourier features, and the subset of samples itself is stored in a random projection tree. The resulting algorithm runs in a single pass through the whole data set, and has a per-iteration computational complexity logarithmic in the size of the subset. These “supersamples” subsampled from the full data provide a concise summary, as demonstrated empirically on mixture models and the MNIST dataset.
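The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it pairs classic reservoir sampling with a simplified one-pass greedy swap that tries to keep the subset's mean random-Fourier-feature vector close to the running mean over all data seen so far. The function names (`rff_features`, `reservoir_sample`, `greedy_supersample`, `mmd`), the RBF kernel choice, and the swap rule are assumptions made for illustration. The sketch also omits the random projection tree, so each step scans the whole subset instead of achieving the logarithmic per-iteration cost described above.

```python
import numpy as np

def rff_features(X, W, b):
    """Random Fourier features approximating an RBF kernel.
    W has shape (d, D) with Gaussian entries; b has shape (D,) in [0, 2*pi)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def reservoir_sample(stream, k, rng):
    """Classic reservoir sampling (Algorithm R): a uniform k-subset in one pass."""
    reservoir = []
    for t, x in enumerate(stream):
        if t < k:
            reservoir.append(x)
        else:
            j = rng.integers(0, t + 1)  # uniform over 0..t inclusive
            if j < k:
                reservoir[j] = x
    return np.array(reservoir)

def greedy_supersample(stream, k, W, b):
    """Illustrative one-pass greedy swap (an assumption, not the paper's method).
    A new point replaces a stored point only if the swap moves the subset's
    mean feature vector closer to the running mean of everything seen so far."""
    subset, feats = [], []
    mean_all = np.zeros(W.shape[1])
    for t, x in enumerate(stream):
        phi = rff_features(x[None, :], W, b)[0]
        mean_all += (phi - mean_all) / (t + 1)  # running mean over the stream
        if t < k:
            subset.append(x)
            feats.append(phi)
            continue
        F = np.array(feats)
        mean_sub = F.mean(axis=0)
        # Row i holds the subset mean if stored point i were replaced by x.
        cand = mean_sub + (phi - F) / k
        errs = np.linalg.norm(cand - mean_all, axis=1)
        i = int(np.argmin(errs))
        if errs[i] < np.linalg.norm(mean_sub - mean_all):
            subset[i], feats[i] = x, phi
    return np.array(subset)

def mmd(A, B, W, b):
    """Approximate maximum mean discrepancy between two samples,
    computed as the distance between their mean feature vectors."""
    return np.linalg.norm(rff_features(A, W, b).mean(0)
                          - rff_features(B, W, b).mean(0))
```

To try it, draw `W` from a standard normal with shape `(d, D)` and `b` uniformly from `[0, 2π)`, pass the rows of a data array as the stream, and compare `mmd(subset, X, W, b)` for the two samplers. A smaller value means the subset's feature means track the full data more closely.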

Similar resources

Super operator Technique in Investigation of the Dynamics of a Two Non-Interacting Qubit System Coupled to a Thermal Reservoir

In this paper, we clarify the applicability of the super operator technique for describing the dissipative quantum dynamics of a system consisting of two qubits coupled with a thermal bath at finite temperature. Using the super operator technique, we solve the master equation and find the matrix elements of the density operator. Considering the qubits to be initially prepared in a general mixed st...

Shiga toxin producing Escherichia coli: identification of non-O157:H7-Super-Shedding cows and related risk factors

Background: Shiga toxin-producing Escherichia coli (STEC) are an important cause of human gastroenteritis and extraintestinal sequelae, with ruminants, especially cattle, as the major source of infection and reservoir. In this study, the fecal STEC shedding of 133 dairy cows was analyzed over a period of twelve months by monthly sampling with the aim to investigate shedding patterns and risk fa...

Feasibility Study of Network Hydraulic Fracture Applied to the Fissured Competent Sand Oil Reservoir

Chang 8 oil deposit, developed in the Hohe and Jihe oil fields on the southern Yi-Shan Slope of the Ordos Basin, is regarded as a typical sand reservoir formation with super-low porosity, poor permeability, and strong anisotropy, as well as local natural faults and fractures. Previous studies held that the matrix reservoir has good permeability, whereas the fracture reservoir has a reverse manner...

Effects of various super absorbent concentrations on runoff volume in slopes and various intensity of simulated rainfall in Shahrekord plain

To study the effect of super absorbent on runoff volume under various slopes and rainfall intensities, an experiment was conducted using a split-factorial block design with one main treatment and two accessory treatments in three replicates. The main treatment consisted of three dominant slopes (10, 20, and 30 percent), and the accessory treatments consisted of five levels of substance su...

A Case History on Integrated Fracture Modeling in a Giant Field

In this paper, a case study is used to demonstrate a straightforward methodology for integrating faults, fractures, and high-permeability layers into a single-porosity single-permeability (SPSP) reservoir simulation model. Applying this method to the Ghawar Arab-D reservoir yielded an adequate model of the water encroachment pattern. The described methodology starts with the identific...

Publication date: 2016